Improved Identification of Differentially Expressed Genes Using Pareto Set based Pruning

Authors

  • Jianjun Hu
  • Jia Xu
Abstract

Identifying differentially expressed genes (DEGs) from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the statistical t-test, fold change, and rank product can be improved by considering other features of differentially expressed genes. We propose a parameter-free, nondominated Pareto set based gene pruning algorithm that removes non-differentially expressed genes before standard DEG identification algorithms are applied. All genes are mapped to a feature space composed of average differences of gene expression and average expression levels, and differentially expressed genes are observed to lie mostly in the boundary regions of that space. Experiments on 17 Gene Expression Omnibus (GEO) datasets showed that Pareto gene pruning can significantly improve popular algorithms such as the t-test, rank product, and fold change in terms of prediction accuracy and AUC values, with the number of identified true DEGs increasing by 11% to 50%.
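
The feature mapping and nondominated-set idea described in the abstract can be illustrated with a minimal sketch. It assumes that the first feature is the absolute difference of group means, that the second is the mean expression over all samples, and that both are maximized when testing dominance. The function names and the toy matrix are hypothetical, and the paper's actual pruning rule may differ.

```python
import numpy as np

def gene_features(expr, is_case):
    """Map each gene (row of expr) to (average difference, average level).

    Assumption: difference is |mean(case) - mean(control)|; level is the
    mean over all samples. The paper may define these features differently.
    """
    expr = np.asarray(expr, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    mean_case = expr[:, is_case].mean(axis=1)
    mean_ctrl = expr[:, ~is_case].mean(axis=1)
    avg_diff = np.abs(mean_case - mean_ctrl)
    avg_level = expr.mean(axis=1)
    return np.column_stack([avg_diff, avg_level])

def nondominated_mask(points):
    """Return True for points that no other point dominates.

    Point j dominates point i if j is >= i in every coordinate and
    strictly > in at least one (maximization in all coordinates).
    Simple O(n^2) loop, fine for illustration only.
    """
    points = np.asarray(points, dtype=float)
    mask = np.ones(points.shape[0], dtype=bool)
    for i in range(points.shape[0]):
        at_least_equal = np.all(points >= points[i], axis=1)
        strictly_better = np.any(points > points[i], axis=1)
        if np.any(at_least_equal & strictly_better):
            mask[i] = False
    return mask

# Toy data: 4 genes x 4 samples, the last two samples are the "case" group.
toy_expr = [
    [1.0, 1.0, 5.0, 5.0],   # large difference, moderate level
    [3.0, 3.0, 3.0, 3.0],   # no difference, moderate level
    [9.0, 9.0, 9.0, 9.0],   # no difference, high level
    [2.0, 2.0, 2.5, 2.5],   # small difference, low level
]
toy_is_case = [False, False, True, True]

features = gene_features(toy_expr, toy_is_case)
boundary = nondominated_mask(features)
```

Here `boundary` marks the genes on the outer (nondominated) boundary of the feature space. How those boundary genes are then used to prune the gene list before the t-test, fold change, or rank product is specific to the paper and is not reproduced here.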

Similar resources

Identification of key genes and pathways involved in vitiligo vulgaris by gene network analysis

Background and Aim: Vitiligo vulgaris is an acquired, chronic skin and hair condition characterized clinically by loss of melanin, which, if untreated, is typically progressive and irreversible. The aim of the present study was to identify potential genes involved in the pathogenesis of vitiligo. Methods: One dataset of mRNA expression in patients with vitiligo (GSE65127) was obtained from ...

Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With the advancement of metagenome data mining, science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, gene selection (feature selection) is essential. Removing redundant genes can increase the speed and accuracy of classification. After applying the gene se...

Investigating the Function of Predicted Proteins from RNA-Seq Data in Holstein and Cholistani Cattle Breeds

This study was performed to determine the digital expression profile of different genes expressed in Holstein and Cholistani breeds as well as to evaluate the performance of predicted proteins derived from differentially expressed genes between these two breeds using RNA-Seq data. For this purpose, the whole mRNA sequence for a blood sample of American Holstein and Pakistani Cholistani cattle p...

Identification and Functional Prediction of Long Non-Coding RNAs Responsive to Drought stress in Lens culinaris L.

Drought stress is one of the main environmental factors that affect the growth and productivity of crop plants, including lentil. In the course of evolution, crucial genetic regulations mediated by non-coding RNAs (ncRNAs) have emerged in plants in response to drought and other abiotic stresses. In the present study, after identifying lncRNAs within the expression profile of lentil, RNA-s...

DYNAMIC PERFORMANCE OPTIMIZATION OF TRUSS STRUCTURES BASED ON AN IMPROVED MULTI-OBJECTIVE GROUP SEARCH OPTIMIZER

This paper presents an improved multi-objective group search optimizer (IMGSO), based on Pareto theory and designed to handle multi-objective optimization problems. The optimizer includes improvements in three areas: the transition-feasible region is used to address constraints, the Dealer’s Principle is used to construct the non-dominated set, and the producer is updated using a tab...

Publication date: 2009